7 Variational Autoencoders
7.1 Training
A separate variational autoencoder was trained for each bird. To determine the optimal number of embedding dimensions, I calculated the Calinski-Harabasz index, the ratio of between-cluster variance to within-cluster variance, using the pre-labelled clusters (fig 7.1). Bird 7358 (66-68 DPH) has relatively stable syllables and song syntax, while bird 6951 (59-63 DPH) has more variable syllables and syntax (see 8.1). For bird 7358, little information is gained beyond 32 dimensions.
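To make the index concrete, here is a minimal sketch of the Calinski-Harabasz calculation for a set of labelled embeddings. The function name `calinski_harabasz` and the toy data are illustrative assumptions, not the code used for fig 7.1. For n points in k clusters, the index is the between-cluster sum of squares divided by (k - 1), over the within-cluster sum of squares divided by (n - k). scikit-learn's `sklearn.metrics.calinski_harabasz_score` should give the same value.

```python
from collections import defaultdict


def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for labelled points.

    points: list of equal-length lists of floats (e.g. latent embeddings).
    labels: one cluster label per point (e.g. pre-labelled syllable types).
    Hypothetical helper for illustration only.
    """
    n = len(points)
    dim = len(points[0])

    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def mean(rows):
        return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0  # size-weighted squared distance of each centroid to the overall mean
    within = 0.0   # squared distance of each point to its own centroid
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)

    if within == 0.0:
        return float("inf")  # every cluster collapses to a single location
    return (between / (k - 1)) / (within / (n - k))


# Toy example: two tight, well-separated clusters in 2 dimensions.
toy_points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
toy_labels = [0, 0, 1, 1]
print(calinski_harabasz(toy_points, toy_labels))
```

A higher value means the pre-labelled clusters are better separated in the embedding. Under this sketch, the calculation would be repeated on the embeddings from each candidate number of dimensions and the scores compared.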
Figure 7.1: Reconstruction loss and Calinski-Harabasz Index.
Figure 7.2: Input (left) and decoded (right) syllables.
Figure 7.3: Traversing the embedding space from the centroid of syllable 'i' to each other syllable centroid.
